Voice Coding Method, Voice Coding Apparatus, and Voice
Decoding Apparatus

Background of the Invention 
Field of the Invention

The present invention relates to a voice
coding/decoding technology based on A-b-S
(Analysis-by-Synthesis) vector quantization.

Description of the Related Art

The voice coding system represented by the CELP
(Code Excited Linear Prediction) coding system based
on the A-b-S vector quantization is applied when the
transmission rate of a PCM voice signal is compressed
from, for example, 64 kbits/sec (kilobits per second) to
approximately 4 through 16 kbits/sec. Such a voice
coding system is in demand as a system for compressing
information while maintaining voice quality in an
in-house communications system, a digital mobile radio
system, etc.

FIG. 1 shows the conventional A-b-S vector
quantization system. 51 is a code book, 52 is a gain
unit, 53 is a linear prediction synthesis filter, 54
is a subtracter, and 55 is an error power evaluation
unit.

In an A-b-S vector quantization coder, the gain
unit 52 first multiplies the code vector C read from
the code book 51 by a gain g. Then, the linear
prediction synthesis filter 53 inputs the above
described scaled code vector, and outputs a
reproduced signal gAC. Then, the subtracter 54
subtracts the reproduced signal gAC from an input
signal X, thereby outputting an error signal E which
indicates the difference between them. Furthermore,
the error power evaluation unit 55 computes an error
power according to the error signal E. The above
described process is performed on all code vectors C
in the code book 51 with optimal gains g, the index
of the code vector C and the gain g which generate the
smallest error power are determined, and they are
transmitted to a decoder.
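The search loop described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the synthesis filter A is modeled as a zero-state convolution with a short impulse response, the optimal gain is taken as the least-squares gain for each code vector, and the names synthesize and abs_search are hypothetical, not taken from the text.

```python
def synthesize(code_vector, impulse_response):
    """Zero-state synthesis filter A, modeled as a truncated convolution."""
    n = len(code_vector)
    out = [0.0] * n
    for i, c in enumerate(code_vector):
        if c == 0.0:
            continue
        for k, h in enumerate(impulse_response):
            if i + k >= n:
                break
            out[i + k] += c * h
    return out


def abs_search(x, code_book, impulse_response):
    """Return (index, gain, error power) of the best code vector for input x."""
    best = None
    for index, c in enumerate(code_book):
        y = synthesize(c, impulse_response)            # A C
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue                                   # all-zero vector, no gain defined
        g = sum(a * b for a, b in zip(x, y)) / energy  # least-squares gain
        err = sum((a - g * b) ** 2 for a, b in zip(x, y))  # power of E = X - gAC
        if best is None or err < best[2]:
            best = (index, g, err)
    return best


# Toy example: the input is exactly 2.0 times the synthesized second code vector.
code_book = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
h = [1.0, 0.5]
x = [2.0 * v for v in synthesize(code_book[1], h)]
index, gain, error_power = abs_search(x, code_book, h)
```

Only the index and the gain would be transmitted; the decoder repeats the synthesize step with the same code book.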

In an A-b-S vector quantization decoder, the code
vector C corresponding to the index transmitted from
the coder is read from the code book 51. Then, the
gain unit 52 scales the code vector C by the gain g
transmitted from the coder. Then, the linear
prediction synthesis filter 53 inputs the scaled code
vector, and outputs the decoded regenerated signal
gAC. The decoder does not require the subtracter 54
and the error power evaluation unit 55.

As described above, in the A-b-S vector
quantization coder, an analyzing process is performed
while a synthesizing (decoding) process is performed
on a code vector C.

FIG. 2 shows a typical conventional CELP system
based on the above described A-b-S vector quantization
system.

In this CELP system, two types of code books are
used, that is, an adaptive code book corresponding to a
periodic (pitch) sound source and a fixed code book
corresponding to a noisy (random) sound source.
According to this system, an A-b-S vector quantizing
process mainly for the periodic voice (voiced sound,
etc.) and a succeeding A-b-S vector quantizing process
mainly for a noisy voice (unvoiced sound, background
sound, etc.) are sequentially performed based on the
respective code books.

In FIG. 2, 61 is a fixed code book, 62 is an
adaptive code book, 63 and 64 are gain units, 65 and
66 are linear prediction synthesis filters, 67 and 68
are error power evaluation units, and 69 and 70 are
subtracters. The fixed code book 61
corresponding to a random sound source and the
adaptive code book 62 corresponding to a pitch sound
source are each contained in the memory. The gain units 63
and 64, the linear prediction synthesis filters 65 and
66, the error power evaluation units 67 and 68, and
the subtracters 69 and 70 can be realized by operation
elements such as a DSP (digital signal processor),
etc.

In the CELP coder with the above described
configuration, the portion comprising the adaptive
code book 62, the gain unit 64, the linear prediction
synthesis filter 66, the subtracter 70, and the error
power evaluation unit 68 outputs a transmission
parameter effective for periodic voice. P indicates
an adaptive code vector output from the adaptive code
book, b indicates a gain in the gain unit 64, and A
indicates the transmission characteristic of the
linear prediction synthesis filter 66.

The coding process performed by this portion is
based on the same principle as the coding process
performed by the code book 51, the gain unit 52, the
linear prediction synthesis filter 53, the subtracter
54, and the error power evaluation unit 55. However,
a sample in the adaptive code book 62 adaptively
changes by the feedback of a previous excitation
signal. The decoder performs a process similar to the
decoding process performed by the code
book 51, the gain unit 52, and the linear prediction
synthesis filter 53 described above by referring to
FIG. 1. However, in this case, a sample in the
adaptive code book 62 also changes adaptively by the
feedback of a previous excitation signal.
On the other hand, the portion comprising the
fixed code book 61, the gain unit 63, the linear
prediction synthesis filter 65, the subtracter 69, and
the error power evaluation unit 67 outputs a
transmission parameter effective for the noisy signal
X', which the subtracter 70 outputs by subtracting the
optimum reproduced signal bAP output by the linear
prediction synthesis filter 66 from the input signal X. The
coding process by this portion is based on the same
principle as the coding process by the code book 51,
the gain unit 52, the linear prediction synthesis
filter 53, the subtracter 54, and the error power
evaluation unit 55. In this case, the fixed code book
61 preliminarily stores a fixed sample. The decoder
performs a process similar to the decoding process
performed by the code book 51, the gain
unit 52, and the linear prediction synthesis filter
53 described above by referring to FIG. 1.

The fixed code book 61 preliminarily stores a
random code vector C corresponding to a fixed sample
value. Therefore, for example, assuming that the vector
dimension length is 40 (corresponding to the number
of samples in a period of 5 msec (milliseconds) when
the sampling frequency is 8 kHz), and that the number
of vectors, that is, the code book size, is 1024, the
fixed code book 61 requires a memory capacity of
40 k (kilo) words.

That is, a large memory capacity is required by
the fixed code book 61 to independently store all
sample values. This is a big problem to be solved
when the CELP voice codec is realized.
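The memory figure quoted above follows directly from the stated dimensions. The short calculation below is only a worked check of that arithmetic; the variable names are illustrative, and one stored sample is assumed to occupy one word.

```python
VECTOR_DIM = 40          # samples per code vector
CODE_BOOK_SIZE = 1024    # number of stored code vectors
SAMPLE_RATE_HZ = 8000    # sampling frequency

# Duration covered by one code vector, in milliseconds.
frame_ms = 1000 * VECTOR_DIM / SAMPLE_RATE_HZ

# Words needed when every sample of every vector is stored (1 word per sample).
stored_words = VECTOR_DIM * CODE_BOOK_SIZE

# Bits needed to address one of the 1024 vectors.
index_bits = CODE_BOOK_SIZE.bit_length() - 1

print(frame_ms, stored_words, stored_words // 1024, index_bits)
```

Here 40 x 1024 words equals 40 k words with k taken as 1024, while the index itself needs only 10 bits; the cost lies in storage, not in transmission.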

To solve this problem, an ACELP (Algebraic Code
Excited Linear Prediction) system has been suggested
to successfully perform the code book searching
process in an algebraic method by arranging a small
number of non-zero sample values at fixed positions
(refer to J. P. Adoul et al., 'Fast CELP coding based
on algebraic codes', Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing,
pp. 1957 - 1960 (April, 1987)).

FIG. 3 shows the configuration of the
conventional ACELP system using an algebraic code
book. An algebraic code book 71 corresponds to the
fixed code book 61 shown in FIG. 2, a gain unit 72
corresponds to the gain unit 63 shown in FIG. 2, a
linear prediction synthesis filter 73 corresponds to
the linear prediction synthesis filter 65 shown in
FIG. 2, a subtracter 74 corresponds to the subtracter
69 shown in FIG. 2, and an error power evaluation unit
75 corresponds to the error power evaluation unit 67
shown in FIG. 2. In the A-b-S process shown in FIG.
3, as in the processes described by referring to FIG.
1 or 2, an A-b-S process is performed using the code
vector C_i generated from the algebraic code book 71
corresponding to an index i, and a gain g.

In this ACELP system, the required amount of
operations and memory can be considerably reduced by
limiting the amplitude value and position of a
non-zero sample. At this time, for example, as shown in
FIG. 4, the N-dimensional M-size algebraic code book
71 storing code vectors C_0, C_1, ..., C_(M-1) is provided.
However, since the number of non-zero samples in a
frame is fixed and the non-zero samples are arranged
at equal intervals, each of the code vectors C_0, C_1,
..., C_(M-1) can be generated in an algebraic method. In
the example shown in FIG. 4, the sample position of
each of the four non-zero samples i_0, i_1, i_2, and i_3 is
standardized, and the amplitude value is ±1.0. The
amplitude of the sample position other than the four
sample positions is assumed to be zero.

As shown on the right of the algebraic code book
71 in FIG. 4, the sample value pattern of the
code vector corresponding to i_0, i_1, i_2, and i_3 depends
on the sample positions i_0, i_1, i_2, and i_3 having the
amplitude of ±1, excluding the sample positions having
the amplitude of zero. For example, the pattern
corresponding to the code vector C_0 is (0, ..., 0, +1, 0,
0, -1, 0, ..., 0, +1, 0, ..., 0, -1, 0, ...). That
is, for the code vector having, as elements, a total
of N samples consisting of four non-zero samples and
N - 4 zero samples, each of the four non-zero samples
i_n (n = 0, 1, 2, 3) can be expressed by a total of
K + 1 bits, that is, 1 bit for amplitude information
(the absolute value of the amplitude is fixed to 1, and
the bit indicates only the polarity), and K bits for the
position information m_n specifying one of 2^K candidates.

The position of a non-zero sample is standardized
by G.729 or G.723.1 of the ITU-T (International
Telecommunication Union - Telecommunication
Standardization Sector).

For example, in the table 77 shown in FIG. 4
corresponding to the standard G.729, each piece of the
position information m_0 through m_2 about the non-zero
samples i_0 through i_2 in the 40 samples corresponding
to 1 frame has candidates at 8 positions, so that one
position can be specified by 3 bits. The position
information m_3 about the non-zero sample i_3 has
candidates at 16 positions, and can be expressed by
4 bits to specify one of the positions. Each piece of
the amplitude information s_0 through s_3 about the
non-zero samples i_0 through i_3 can be expressed by 1 bit
because the absolute value of each amplitude is fixed
to 1.0, and only the polarity is represented. Therefore,
in G.729, the non-zero samples i_0 through i_3 can be
formed by 17-bit data comprising the amplitude
information s_0 through s_3 each being formed by 1 bit
and the position information m_0 through m_3 each being
formed by 3 or 4 bits as shown by 76 in FIG. 4.
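The 17-bit budget can be illustrated with a small packing sketch. The candidate positions below follow the G.729 fixed code book layout (three tracks of 8 positions and one of 16); the bit ordering inside the word and the names pack_index and unpack_index are assumptions made for illustration, not the bit stream format of the standard.

```python
# Candidate positions per non-zero sample (G.729 layout, 40-sample frame).
TRACKS = [
    list(range(0, 40, 5)),                          # i_0: 0, 5, ..., 35
    list(range(1, 40, 5)),                          # i_1: 1, 6, ..., 36
    list(range(2, 40, 5)),                          # i_2: 2, 7, ..., 37
    list(range(3, 40, 5)) + list(range(4, 40, 5)),  # i_3: 16 candidates
]
POSITION_BITS = [3, 3, 3, 4]


def pack_index(positions, signs):
    """Pack four (position, sign) pairs into one integer (hypothetical ordering)."""
    word = 0
    for track, bits, pos, sign in zip(TRACKS, POSITION_BITS, positions, signs):
        word = (word << bits) | track.index(pos)   # K bits of position information
        word = (word << 1) | (1 if sign > 0 else 0)  # 1 bit of polarity
    return word


def unpack_index(word):
    """Inverse of pack_index: recover the positions and signs."""
    positions, signs = [], []
    for track, bits in zip(reversed(TRACKS), reversed(POSITION_BITS)):
        signs.append(1 if word & 1 else -1)
        word >>= 1
        positions.append(track[word & ((1 << bits) - 1)])
        word >>= bits
    return positions[::-1], signs[::-1]


positions = [5, 16, 37, 9]
signs = [1, -1, -1, 1]
word = pack_index(positions, signs)
total_bits = sum(POSITION_BITS) + len(TRACKS)  # 13 position bits + 4 sign bits
```

The point of the sketch is only that 3 + 3 + 3 + 4 position bits plus 4 polarity bits always fit in 17 bits, whatever ordering a real bit stream uses.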

In the table 78 shown in FIG. 4 corresponding to
the standard G.723.1, each position candidate of the
non-zero samples i_0 through i_3 is determined such that
candidate positions are assigned to every second
sample. Thus, each piece of the position
information m_0 through m_3 about the non-zero samples
i_0 through i_3 can be expressed by 3 bits. As in the
standard G.729, each piece of the amplitude
information s_0 through s_3 about the non-zero samples
i_0 through i_3 can be expressed by 1 bit. As described
above, in G.723.1, the non-zero samples i_0 through i_3
can be formed by 16-bit data comprising the amplitude
information s_0 through s_3 each being formed by 1 bit
and the position information m_0 through m_3 each being
formed by 3 bits as shown by 76 in FIG. 4.

For example, when the i-th coded word has the
values s_0^i through s_3^i and m_0^i through m_3^i, the
coded word sample c_i(n) can be defined by the following
equation.

\[ c_i(n) = s_0^i \delta(n - m_0^i) + s_1^i \delta(n - m_1^i) + s_2^i \delta(n - m_2^i) + s_3^i \delta(n - m_3^i) \tag{1} \]

where s_0^i through s_3^i indicate the amplitude
information about the non-zero samples, and m_0^i
through m_3^i indicate the position information about
the non-zero samples. In addition, δ( ) indicates a
delta function, and the following equations hold.

δ(n) = 1 for n = 0

δ(n) = 0 for n ≠ 0
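The definition of the coded word sample can be written out directly. The following sketch builds a code vector from four polarity values and four positions using the delta function as defined above; the function names and the example positions are illustrative only.

```python
def delta(n):
    """Delta function: 1 at n = 0, otherwise 0."""
    return 1 if n == 0 else 0


def code_vector(signs, positions, n_samples=40):
    """c(n) = sum over the non-zero samples of s * delta(n - m)."""
    return [
        sum(s * delta(n - m) for s, m in zip(signs, positions))
        for n in range(n_samples)
    ]


# Four non-zero samples of amplitude +1 or -1; all other samples are zero.
c = code_vector(signs=[1, -1, 1, -1], positions=[5, 16, 22, 39])
non_zero = [(n, v) for n, v in enumerate(c) if v != 0]
```

Because the whole vector is determined by eight small integers, nothing needs to be stored in a table.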

In addition, the error power can be expressed
by the following equation using the input signal X shown
in FIG. 3, the gain g, the code vector C_i, and the
matrix H of the impulse response of the linear
prediction synthesis filter 73.

\[ |E|^2 = (X - g H C_i)^2 \tag{2} \]

The evaluation function argmax(F_i) for obtaining
the minimum error power can be expressed by the
following equation.

\[ \mathrm{argmax}(F_i) = \left[ (X^T H C_i)^2 / \{ (H C_i)^T (H C_i) \} \right] \tag{3} \]

where, assuming that

\[ X^T H = D = d(i) \tag{4} \]

and

\[ H^T H = \Phi = \phi(i, j) \tag{5} \]

the evaluation function argmax(F_i) expressed by
the equation 3 can be expressed by the following
equation.

\[ \mathrm{argmax}(F_i) = \left[ (D^T C_i)^2 / \{ (C_i)^T \Phi C_i \} \right] \tag{6} \]

where the characters in the upper case indicate
vectors.

Since the above described equations 4 and 5
contain no elements of the code vector C_i, the
arithmetic operation can be preliminarily performed
even when the number M of patterns (size) of a coded
word is large. Therefore, a higher-speed operation
can be performed by the equation 6 than by the
equation 3.

The process relating to the code vector is
performed on four samples having the amplitude of ±1.0
as described above. Accordingly, the denominator and
the numerator of the equation 6 can be respectively
obtained by the following equations 7 and 8.

\[ (C_i)^T \Phi C_i = \sum_{i=0}^{3} \phi(m_i, m_i) + 2 \sum_{i=0}^{2} \sum_{j=i+1}^{3} s_i s_j \phi(m_i, m_j) \tag{7} \]

\[ (D^T C_i)^2 = \left[ \sum_{i=0}^{3} s_i d(m_i) \right]^2 \tag{8} \]

where \sum_{i=0}^{3} indicates the accumulation from i=0
through i=3.
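The saving offered by the precomputed quantities can be checked numerically. The sketch below assumes that H is the lower-triangular matrix built from an impulse response h, computes d and phi once, and then evaluates the criterion for one set of four non-zero samples in two ways: from the precomputed values (the numerator is taken as the squared sum of s times d at the four positions, which follows from substituting equation 1 into equation 6) and directly from the synthesized vector. The test signal and all function names are hypothetical.

```python
def impulse_matrix(h, n):
    """Lower-triangular n x n matrix H built from the impulse response h."""
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(n)]
            for r in range(n)]


def precompute(x, h):
    """d = H^T x and phi = H^T H, both independent of the code vector."""
    n = len(x)
    H = impulse_matrix(h, n)
    d = [sum(x[r] * H[r][j] for r in range(n)) for j in range(n)]
    phi = [[sum(H[r][i] * H[r][j] for r in range(n)) for j in range(n)]
           for i in range(n)]
    return d, phi


def fast_criterion(d, phi, signs, positions):
    """Numerator / denominator using only the four non-zero samples."""
    num = sum(s * d[m] for s, m in zip(signs, positions)) ** 2
    den = sum(phi[m][m] for m in positions)
    for a in range(len(positions)):
        for b in range(a + 1, len(positions)):
            den += 2 * signs[a] * signs[b] * phi[positions[a]][positions[b]]
    return num / den


def direct_criterion(x, h, signs, positions):
    """(x^T H c)^2 / ((H c)^T (H c)) computed on the full vectors."""
    n = len(x)
    c = [0.0] * n
    for s, m in zip(signs, positions):
        c[m] += s
    H = impulse_matrix(h, n)
    y = [sum(H[r][j] * c[j] for j in range(n)) for r in range(n)]
    return sum(a * b for a, b in zip(x, y)) ** 2 / sum(v * v for v in y)


x = [((7 * i) % 11 - 5) / 5 for i in range(40)]   # arbitrary test signal
h = [1.0, 0.6, 0.3, 0.1]                          # arbitrary impulse response
signs, positions = [1, -1, 1, -1], [0, 6, 7, 38]
d, phi = precompute(x, h)
fast = fast_criterion(d, phi, signs, positions)
direct = direct_criterion(x, h, signs, positions)
```

The per-candidate work in fast_criterion touches 4 values of d and 10 values of phi regardless of the frame length, which is the property the text relies on.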

The amount of operations by the equations 7 and
8 does not depend on the parameter (number of
dimensions) N, and is small. Therefore, even if the
operations are performed the number of times
corresponding to the number M of coded word patterns,
the amount of the operations is not large. Therefore,
with the configuration using the algebraic code book
71 shown in FIG. 3, the amount of operations can be
reduced much more than with the configuration using
the fixed code book 61 shown in FIG. 2. In addition,
each code vector output from the algebraic code book
71 can be generated in an algebraic method according
to the amplitude information (polarity information)
and the position information. As a result, it is not
necessary to store each code vector in the memory,
thereby considerably reducing the requirements of the
memory.

In the above described ACELP system, the
requirements of the memory and the amount of
operations can be successfully reduced. However,
since the number of non-zero samples in a frame is
fixed to four, and the restriction is placed such
that the positions of samples are set at equal
intervals, there is the problem that the bit rate
representing the code vector index is determined
by two parameters, that is, the frame length
parameter and the non-zero sample number parameter,
thereby requiring a comparatively large number of bits
to express a code vector index.

For example, when one frame contains 40 samples
according to the standard G.729 of the ITU-T, a total
of 17 bits are used as a code vector index as shown
in the table 77 in FIG. 4. Since 40 samples span
5 msec at 8 kHz sampling, this amounts to 34 bits per
10 msec, which corresponds to 42% of the total
transmission capacity (8 kbits/sec, 80 bits/10 msec)
prescribed by G.729.

If one frame contains 80 samples, the number of
bits required to express the position information
about a non-zero sample is larger by one than in the
above described case. Therefore, a total of 21 bits
are used as a code vector index. The number of bits
corresponds to 62.5% of the total transmission
capacity prescribed by G.729, and is much larger than
in one frame containing 40 samples.

Normally, to realize a very low bit rate voice
CODEC at about 4 kbits/sec, the frame length should be
extended. However, when the above described
conventional ACELP system is applied to this
requirement, there arises the problem of a
considerable increase of the transmission bit rate of
a code vector index. That is, the conventional ACELP
system has the problem that it conflicts with the
demand to lower the bit rate by decreasing the number
of parameter transmission bits per unit time through
higher transmission efficiency.

In addition to this problem, the conventional
ACELP system also has the problem that the ability to
identify a pitch period shorter than a frame length
is lowered when the frame length is extended.

Summary of the Invention 

The present invention has been developed based
on the above described background, and aims at setting
a constant transmission amount of a code vector index
and maintaining the identifying ability for a pitch
period in a voice coding/decoding system based on the
A-b-S vector quantization using a sound source coded
word formed only by non-zero amplitude values.

The present invention relates to a voice coding
technology based on the analysis-by-synthesis vector
quantization using a code book in which sound source
code vectors are formed only by non-zero amplitude
values, and variably controls the sample position of
a non-zero amplitude value using an index and a
transmission parameter indicating a feature amount of
voice. In this case, a lag value corresponding to a
pitch period can be used as a transmission parameter.
Furthermore, a pitch gain value can also be used.
Corresponding to a lag value or a pitch gain value,
the sample position of a non-zero amplitude value can
be redesigned within a period corresponding to the lag
value.

With the above described configuration, the
position of a non-zero sample output from a code book
in the A-b-S vector quantization can be changed and
controlled using an index and a transmission parameter
indicating the feature amount of voice such as a lag
value, a pitch gain, etc. As a result, according to
the present invention, it is not necessary to increase
the number of necessary transmission bits even when
a frame length is extended, thereby successfully
avoiding the deterioration of the transmission
efficiency.

In addition, the present invention has the merit
that the pitch periodicity can be easily preserved with
a pitch emphasizing process, etc., even in a longer
frame.

Brief Description of the Drawings

Other objects and features of the present
invention can be easily understood by one of ordinary
skill in the art from the descriptions of preferred
embodiments by referring to the attached drawings in
which:

FIG. 1 shows the conventional A-b-S vector
quantization;

FIG. 2 shows the conventional CELP system;

FIG. 3 shows the configuration according to the
conventional ACELP system;

FIG. 4 shows the outline of the ACELP system;

FIG. 5 shows the principle of the present
invention (coding search process);

FIG. 6 shows the principle of the present
invention (regenerating process on the decoding side);

FIG. 7 shows the first preferred embodiment
according to the present invention (coding search
process);

FIG. 8 shows the first preferred embodiment
according to the present invention (regenerating
process on the decoding side);

FIG. 9 is a flowchart of the first preferred
embodiment according to the present invention;

FIGS. 10A through 10C show the configuration
variable code book using a lag value according to the
preferred embodiment of the present invention;

FIG. 11 shows the non-zero sample position
corresponding to a lag value according to the
preferred embodiment of the present invention;

FIG. 12 shows the pitch emphasizing process;

FIG. 13 shows the second preferred embodiment
according to the present invention (coding search
process);

FIG. 14 shows the second preferred embodiment
according to the present invention (regenerating
process on the decoding side);

FIG. 15 is a flowchart according to the second
preferred embodiment of the present invention; and

FIGS. 16A through 16C show waveform examples of
each signal.

Description of the Preferred Embodiments

The preferred embodiments of the present invention
are described below by referring to the attached
drawings.

FIGS. 5 and 6 show the principle of the present
invention. 1 and 1' are configuration variable code
books, 2 and 2' are gain units, 3 and 3' are linear
prediction synthesis filters, 4 is a subtracter, and 5 is
an error power evaluation unit.

The configuration variable code books 1 and 1'
correspond to an algebraic code book for outputting
a code vector comprising, for example, a plurality of
non-zero samples, and have the function of
reconstructing themselves by controlling the position of
non-zero samples based on an index i and a
transmission parameter p such as a pitch period (lag
value), etc. At this time, the configuration variable
code books 1 and 1' variably control the position of
non-zero samples without changing the number of
non-zero samples. Thus, the number of necessary bits for
transmission of a code vector index can be prevented
from increasing.

In the coder with the principle configuration
according to the present invention shown in FIG. 5,
after the position of a non-zero sample is controlled
according to an index i and a transmission parameter
p, the gain unit 2 first scales the code vector C_i
output from the configuration variable code book 1 by
the gain g. Then, the linear prediction synthesis
filter 3 inputs the above described scaled code
vector, and outputs a reproduced signal gAC_i. Then, the
subtracter 4 subtracts the above described reproduced
signal gAC_i from the input signal X, and outputs the
difference between them as an error signal E. Next,
the error power evaluation unit 5 computes the error
power according to the error signal E. The above described
process is performed on all code vectors C_i output
from the configuration variable code book 1 and on
plural types of gains g, the index i of the
code vector C_i and the gain g with which the above
described error power is the smallest are determined,
and they are transmitted to the decoder.

In the decoder with the principle configuration
according to the present invention shown in FIG. 6,
a parameter separation unit 6 separates each parameter
from received data transmitted from the coder. Then,
the configuration variable code book 1' outputs a code
vector C_i according to the index i and the
transmission parameter p in the above described
separated parameters. Next, the gain unit 2' scales
the above described code vector C_i by the gain g
separated by the parameter separation unit 6. Then,
the linear prediction synthesis filter 3' inputs the
scaled code vector, and outputs the decoded
regenerated signal gAC_i. A linear prediction
parameter, not shown in FIG. 6, is provided for the
linear prediction synthesis filter 3' by the parameter
separation unit 6.

Various transmission parameters p in the
configuration shown in FIGS. 5 and 6 can be selected
corresponding to the characteristics of a voice
signal. For example, a pitch period (lag value), a
gain, etc. can be adopted.

FIGS. 7 and 8 show the first embodiment
according to the principle configuration shown in
FIGS. 5 and 6. 11 and 11' are configuration variable
code books, 12 and 12' are gain units, 13 and 13' are
linear prediction synthesis filters, 14 is a
subtracter, 15 is an error power evaluation unit, 16
is a non-zero sample position control unit, 17 is a
pitch emphasis filter, and 18 is a parameter
separation unit.

As shown at the middle and lower parts in FIG.
7 (and in FIG. 8), the configuration variable code
books 11 and 11' comprise a non-zero sample position
control unit 16 for inputting an index i and a pitch
period (lag value) l which is a transmission
parameter; and a pitch emphasis filter 17 for
inputting an output signal of the non-zero sample
position control unit 16 and the pitch period (lag
value) l. The non-zero sample position control unit
16 does not change the number of non-zero samples, but
variably controls the position of a non-zero sample
based on the pitch period (lag value) l. The pitch
emphasis filter 17 is a feedback filter that, when the
lag value is shorter than the length of a frame,
synthesizes the samples beyond the length
corresponding to the lag value from the samples one
lag value earlier.

The function of each unit shown in FIGS. 7 and
8 can also be realized by operation elements such as
a DSP (digital signal processor), etc.

In the conventional ACELP system, non-zero
samples have been assigned such that they can be
placed in the entire range of a frame depending on the
frame length. However, when a lag value corresponding
to the pitch period is smaller than the length of a
frame, the samples beyond the length corresponding
to the lag value can be designed to be synthesized
from the samples one lag value earlier using a
feedback filter. In this case, it is wasteful to assign
non-zero samples in a range larger than the one
corresponding to the lag value in a frame.

According to the present embodiment, the non-zero
sample position control unit 16 assigns the non-zero
samples within a pitch period, that is, within the
range of the lag value. Simultaneously, when the lag
value exceeds the value corresponding to a half of the
frame length, the non-zero sample position control
unit 16 removes some of the non-zero samples assigned
in a pitch period, namely those assigned to the last
half, which is less influenced by the feedback
process of the pitch emphasis filter 17, and variably
controls the positions of the non-zero samples. Thus,
even if the lag value and the frame length change, a
constant number of non-zero samples can be maintained,
thereby preventing the number of necessary bits in
transmitting a code vector index from increasing.

The overall operation of the configuration
according to the first embodiment shown in FIGS. 7 and
8 is the same as the operation of the principle
configuration shown in FIGS. 5 and 6.

FIG. 9 is a flowchart of the operation process
performed by the non-zero sample position control unit
16 provided in the configuration variable code books
11 and 11' shown in FIGS. 7 and 8. In the example
described below, one frame contains 80 samples (8 kHz
sampling), the number of non-zero samples is 4, the
lag value ranges from 20 samples (400 Hz) through 147
samples (54.4 Hz), and the number of index
transmission bits is 17.

First, the position of a non-zero sample is
initialized (step A1 in FIG. 9). In this step, the
non-zero sample positions 0 through 39 are set at
equal intervals in the array data smp_pos[i]
(0 <= i < 40) containing 40 elements.

Then, a lag value corresponding to an input pitch
period is determined. The lag value is not shown in
FIG. 7 or 8, but can be computed in the A-b-S process
(corresponding to the configuration at the upper part
of FIG. 2), to be performed before the ACELP process,
using an adaptive code book.

Next, it is determined whether or not the lag
value is equal to or smaller than the first set value
of 40 (step A2 in FIG. 9). If the determination is YES,
then the process in step A6 shown in FIG. 9 is
performed, and each non-zero sample position is entered.

As a result, when the lag value corresponding to
the pitch period is equal to or smaller than 40, then
the position of a non-zero sample is determined as
shown in FIG. 10A. The arrangement is the same as
that shown in the table 77 in FIG. 4 corresponding to the
above described ITU-T standard G.729.

On the other hand, when the determination in step
A2 shown in FIG. 9 is NO, it is determined whether or
not the lag value is equal to or larger than the
second set value of 80 (step A3 in FIG. 9). If the
determination is NO, the contents of the array data
smp_pos[] are sequentially changed in the for loop
of the process of controlling the position of
a non-zero sample in step A5 shown in FIG. 9. Then,
using the changed array data, the process of entering
the position of the non-zero sample in step A6 is
performed.

As a result, when the lag value corresponding to
a pitch period is larger than 40 and smaller than 80,
for example, when it is 45, the position of a non-zero
sample is determined as shown in FIG. 10B. As shown
in FIG. 11, the arrangement is obtained by adding the
sample positions 40, 42, and 44, which replace the sample
positions 35, 37, and 39 in the arrangement shown in
the table in FIG. 10A.

Practically, if the lag value is, for example,
45, then i = 0, ix = 40, and iy = 0 are set as initial
values, and since (lag - 41) / 2 + 1 = 3, three sample
positions are position-controlled. That is, the
operation smp_pos[39 - iy] = ix is performed using
ix = 40 and iy = 0, so that the sample position 40
replaces the sample position 39 in the sample position
data smp_pos[39]. Then, ix = 42 and iy = 2 are obtained
using ix += 2 and iy += 2, and the sample position 42
replaces the sample position 37 in the sample position
data smp_pos[37]. Furthermore, using the values ix = 44
and iy = 4, the sample position 44 replaces the sample
position 35 in the sample position data smp_pos[35].
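The position control walked through above can be sketched as follows. This is a reading of the text, not the flowchart of FIG. 9 itself: the function name control_positions is hypothetical, the loop count (lag - 41) / 2 + 1 is taken with integer division, the first branch assumes that lag values up to and including 40 leave the table unchanged, and lag values of 80 or more are clipped to 80.

```python
def control_positions(lag, frame_len=80, base=40):
    """Return the 40 candidate positions smp_pos for a given lag value."""
    smp_pos = list(range(base))          # step A1: positions 0 through 39

    if lag <= base:                      # step A2: lag fits in the base table
        return smp_pos
    if lag >= frame_len:                 # steps A3/A4: clip at the frame length
        lag = frame_len

    # Step A5: replace positions from the end of the table, two at a time.
    count = (lag - 41) // 2 + 1
    ix, iy = 40, 0
    for _ in range(count):
        smp_pos[39 - iy] = ix
        ix += 2
        iy += 2
    return smp_pos                       # step A6 would enter these positions


pos_45 = control_positions(45)
pos_80 = control_positions(100)          # clipped to a lag value of 80
```

For a lag value of 45 the entries at indices 39, 37, and 35 become 40, 42, and 44; for a clipped lag value of 80 every odd entry is replaced, leaving the 40 even positions 0 through 78. In both cases the table still holds 40 distinct candidates, so the index width does not change.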

As described above, when the lag value
corresponding to the pitch period is larger than 40
and smaller than 80 according to the present
embodiment, the sample positions are removed by the
number of samples corresponding to the increase from
the lag value of 40 so that the positions are
reconstructed within the range of the lag value,
thereby reconstructing the positions without changing
the number of non-zero samples.

When the determination in step A3 shown in FIG.
9 is YES, the clipping process in step A4 shown in
FIG. 9 is performed. That is, when the lag value
exceeds 80 corresponding to the frame length, it is
insignificant to assign a non-zero sample outside the
range of the frame length. Therefore, the lag
value is clipped at 80, and then the process of
controlling the positions of non-zero samples in step
A5 shown in FIG. 9 and the subsequent process of
entering the positions of non-zero samples in step A6
are performed. As a result, the positions of non-zero
samples are determined as shown in FIG. 10C.

In the above described control process, the
positions of non-zero samples are reconstructed
corresponding to the lag value even when the lag value
increases. Therefore, it is possible to maintain the
number of bits of 17 to be transmitted for a code
vector index without changing the number of non-zero
samples.

FIG. 12 shows the pitch emphasis process
performed by the pitch emphasis filter 17 forming
parts of the configuration variable code books 11 and
11' shown in FIGS. 7 and 8. 31 and 34 are coefficient
units, 32 is an adder, and 33 is a delay circuit.

In FIG. 12, the transfer function of the
configuration including the coefficient units 31 and
34, the adder 32, and the delay circuit 33 can be
expressed by $P(z) = a / (1 - p\,z^{-\mathrm{lag}})$,
where a is the coefficient of the coefficient unit 31,
p is the coefficient of the coefficient unit 34, and
lag indicates the lag value. For example, the
coefficient a of the coefficient unit 31 is a = 1.0 in
the range of 0 through (lag - 1), and a = 0.0 in the
range of lag through 79. The coefficient p of the
coefficient unit 34 is 1.0. The coefficients a and p
are not limited to these values, but can be set to
other values.
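
In the time domain, the transfer function above corresponds to the difference equation below. This is a sketch under two assumptions not stated in the text: that the coefficient unit 31 acts on the input before the adder 32, and that the state of the delay circuit 33 is zero at the start of the frame. The symbols x(n) and y(n) are introduced here for the input and output at sample n.

```latex
y(n) = a(n)\,x(n) + p\,y(n - \mathrm{lag}), \qquad
a(n) =
\begin{cases}
1.0, & 0 \le n \le \mathrm{lag} - 1 \\
0.0, & \mathrm{lag} \le n \le 79
\end{cases}
```

With p = 1.0, this gives y(n) = x(n) for n < lag and y(n) = y(n - lag) for n >= lag, that is, the first lag samples are repeated periodically over the rest of the frame.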

With the above described circuit configuration,
when the lag value is smaller than the frame length,
the samples in the frame located at or beyond the
position corresponding to the lag value are fed back
from the samples one lag value earlier and
synthesized. As a result, a sequence can be generated
in synchronization with the pitch period, while
preserving the ability to represent pitch periodicity.
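
The feedback operation described above may be sketched as follows, using the example coefficients given in the text (a = 1.0 for positions 0 through lag - 1, a = 0.0 from lag through 79, and p = 1.0). The function name pitch_emphasis and the zero initial state of the delay circuit are assumptions made for this illustration.

```c
#include <assert.h>

#define FRAME_LENGTH 80   /* frame length in samples */

/* Sketch of the filter of FIG. 12: out[n] = a * in[n] + p * out[n - lag]. */
void pitch_emphasis(const double in[FRAME_LENGTH],
                    double out[FRAME_LENGTH], int lag)
{
    const double p = 1.0;                 /* coefficient unit 34 */
    int n;

    assert(lag > 0);
    for (n = 0; n < FRAME_LENGTH; n++) {
        /* coefficient unit 31: passes the input only before the lag */
        double a = (n < lag) ? 1.0 : 0.0;
        /* delay circuit 33: output delayed by lag, zero before start */
        double fed_back = (n >= lag) ? out[n - lag] : 0.0;

        out[n] = a * in[n] + p * fed_back; /* adder 32 */
    }
}
```

With a lag value of 30, a non-zero sample at position 3 reappears at positions 33 and 63, while an input sample at position 50 is suppressed because a = 0.0 there.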

FIGS. 13 and 14 show the second embodiment of the
present invention based on the principle configuration
shown in FIGS. 5 and 6. 21 and 21' are configuration
variable code books, 22 and 22' are gain units, 23 and
23' are linear prediction synthesis filters, 24 is a
subtracter, 25 is an error power evaluation unit, 26
is a non-zero sample position control unit, 27 is a
pitch synchronization filter, and 28 is a parameter
separation unit.

The entire operation of the configuration
according to the second embodiment shown in FIGS. 13
and 14 is the same as the operation according to the
principle configuration described by referring to
FIGS. 5 and 6.

The configuration variable code books 21 and 21'
comprise the non-zero sample position control unit 26
and the pitch synchronization filter 27, as with the
configuration variable code books 11 and 11' (shown
in FIGS. 7 and 8) corresponding to the first
embodiment of the present invention. The
configuration according to the second embodiment is
different from the first embodiment in that the
non-zero sample position control unit 26 and the pitch
synchronization filter 27 input a pitch gain G in
addition to the lag value 1 corresponding to the pitch
period as a transmission parameter.

As the lag value corresponding to the pitch period
computed in the A-b-S process (corresponding to the
upper half of the configuration shown in FIG. 2) using
an adaptive code book, the most probable value in the
search range is selected even when the input voice has
no definite pitch period. Therefore, in a region of
an unvoiced sound or a background sound, for which a
noisy sound source is appropriate, a pseudo-pitch
period is extracted, and the information about the
pitch period is transmitted from the coder to the
decoder. In this case, a large pitch gain G indicates
a strong pitch periodicity, and a small pitch gain
G indicates a weak pitch periodicity such as that of
an unvoiced sound, a background sound, etc. According
to the second embodiment of the present invention, the
pitch gain G is adopted as one of the transmission
parameters.

FIG. 15 is a flowchart of the operating process
performed by the non-zero sample position control unit
26 in the configuration variable code books 21 and 21'
shown in FIGS. 13 and 14. In this flowchart, the
control processes in steps B1, B3, B4, B7, B5, and B6
are the same as the processes in steps A1, A2, A3, A4,
A5, and A6 in the flowchart shown in FIG. 9
corresponding to the first embodiment of the present
invention.

The second embodiment is different from the first
embodiment in the process performed when the pitch
gain G is smaller than a predetermined threshold.
That is, in step B2 shown in FIG. 15, it is determined
whether or not the pitch gain G is smaller than the
threshold. If the determination is YES, then setting
a pitch period is meaningless, and therefore the lag
value is clipped at 80, which equals the frame length,
and the same process as in the first embodiment is
performed.
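
The determination in step B2 may be sketched as follows. The function name effective_lag is hypothetical, and the threshold is passed as a parameter because the specification gives no value for it. The second branch repeats the clipping of the first embodiment for lag values exceeding the frame length.

```c
#include <assert.h>

#define FRAME_LENGTH 80   /* frame length in samples */

/* Returns the lag value actually used for position control. */
int effective_lag(int lag, double pitch_gain, double threshold)
{
    assert(lag > 0);
    if (pitch_gain < threshold)
        return FRAME_LENGTH;   /* step B2 YES: weak periodicity, clip */
    if (lag > FRAME_LENGTH)
        return FRAME_LENGTH;   /* clipping as in the first embodiment */
    return lag;
}
```

For example, with an illustrative threshold of 0.5, a lag value of 45 is kept when the pitch gain is 0.9 and is replaced by 80 when the pitch gain is 0.2.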

In the above described control process, the
characteristics of the present embodiment can be
further improved.

FIGS. 16A through 16C show the input voice X (FIG.
16A; corresponding to the X shown in FIG. 2), the
noisy input signal X' to the present embodiment (FIG.
16B; corresponding to the X' shown in FIG. 5, etc.),
and an example of a waveform from the configuration
variable code book of the present invention (FIG. 16C;
corresponding to 1 shown in FIG. 5, etc.).

The embodiments of the present invention are
described above, but the present invention is not
limited to the described embodiments, and additions
and amendments can be made to them. For example, the
frame length, the number of samples, etc. can be
optionally selected corresponding to an applicable
system. In addition, a transmission parameter
corresponding to, for example, the formant of a vowel
can be used. Furthermore, the present invention can
be applied not only to the ACELP system, but also to
a voice coding system in which a plurality of non-zero
samples are used and the positions of the non-zero
samples are controlled using a transmission parameter.

What is claimed is:

1. A voice coding method based on analysis-by-synthesis
vector quantization using a code book containing a
voice source code vector having only a plurality of
non-zero amplitude values, comprising the step of

variably controlling a position of a sample of
the non-zero amplitude value using an index and a
transmission parameter indicating a feature amount of
voice.

2. The method according to claim 1, further
comprising the step of

variably controlling the position of the sample
of the non-zero amplitude value using the index and
a lag value corresponding to a pitch period which is
a transmission parameter indicating the feature amount
of voice.

3. The method according to claim 2, further
comprising the step of

reconstructing the position of the sample of the
non-zero amplitude value within a region corresponding
to the lag value depending on a relationship between
the lag value and a frame length which is a coding
unit of the voice.

4. The method according to claim 1, further
comprising the step of

variably controlling the position of the sample
of the non-zero amplitude value using the index and
a lag value corresponding to a pitch period which is
a transmission parameter indicating the feature amount
of voice and a pitch gain value.

5. The method according to claim 4, further
comprising the step of

reconstructing the position of the sample of the
non-zero amplitude value within a region corresponding
to the lag value depending on a relationship between
the lag value and a frame length which is a coding
unit of the voice.

6. The method according to claim 5, further
comprising the step of

reconstructing the position of the sample of the
non-zero amplitude value within a region corresponding
to the lag value depending on the pitch gain value.

7. A voice decoding method for decoding a voice
signal coded by a voice coding method based on
analysis-by-synthesis vector quantization using a code
book containing a voice source code vector having only
a plurality of non-zero amplitude values, comprising
the step of

variably controlling a position of a sample of
the non-zero amplitude value using an index and a
transmission parameter indicating a feature amount of
voice.

8. The method according to claim 7, further
comprising the step of

variably controlling the position of the sample
of the non-zero amplitude value using the index and
a lag value corresponding to a pitch period which is
a transmission parameter indicating the feature amount
of voice.

9. The method according to claim 8, further
comprising the step of

reconstructing the position of the sample of the
non-zero amplitude value within a region corresponding
to the lag value depending on a relationship between
the lag value and a frame length which is a coding
unit of the voice.

10. The method according to claim 7, further
comprising the step of

variably controlling the position of the sample
of the non-zero amplitude value using the index and
a lag value corresponding to a pitch period which is
a transmission parameter indicating the feature amount
of voice and a pitch gain value.

11. The method according to claim 10, further
comprising the step of

reconstructing the position of the sample of the
non-zero amplitude value within a region corresponding
to the lag value depending on a relationship between
the lag value and a frame length which is a coding
unit of the voice.

12. The method according to claim 11, further
comprising the step of

reconstructing the position of the sample of the
non-zero amplitude value within a region corresponding
to the lag value depending on the pitch gain value.

13. A voice coding apparatus based on
analysis-by-synthesis vector quantization using a code
book containing a voice source code vector having only
a plurality of non-zero amplitude values, comprising

a configuration variable code book unit variably
controlling a position of a sample of the non-zero
amplitude value using an index and a transmission
parameter indicating a feature amount of voice.

14. The apparatus according to claim 13, wherein

said configuration variable code book unit
variably controls the position of the sample of the
non-zero amplitude value using the index and a lag
value corresponding to a pitch period which is a
transmission parameter indicating the feature amount
of voice.

15. The apparatus according to claim 13, wherein

said configuration variable code book unit
variably controls the position of the sample of the
non-zero amplitude value using the index and a lag
value corresponding to a pitch period which is a
transmission parameter indicating the feature amount
of voice and a pitch gain value.

16. A voice decoding apparatus for decoding a voice
signal coded by a voice coding apparatus based on
analysis-by-synthesis vector quantization using a code
book containing a voice source code vector having only
a plurality of non-zero amplitude values, comprising

a configuration variable code book unit variably
controlling a position of a sample of the non-zero
amplitude value using an index and a transmission
parameter indicating a feature amount of voice.

17. The apparatus according to claim 16, wherein

said configuration variable code book unit
variably controls the position of the sample of the
non-zero amplitude value using the index and a lag
value corresponding to a pitch period which is a
transmission parameter indicating the feature amount
of voice.

18. The apparatus according to claim 16, wherein

said configuration variable code book unit
variably controls the position of the sample of the
non-zero amplitude value using the index and a lag
value corresponding to a pitch period which is a
transmission parameter indicating the feature amount
of voice and a pitch gain value.

Abstract of the Disclosure

A gain unit scales a code vector Ci output from
a configuration variable code book by a gain g after
the positions of non-zero samples are controlled
according to an index and a transmission parameter p.
A linear prediction synthesis filter inputs the
multiplication result, and outputs a regenerated
signal gACi. A subtracter outputs an error signal E
by subtracting the regenerated signal gACi from an
input signal X. An error power evaluation unit
computes an error power according to the error signal
E. The above described processes are performed on all
code vectors Ci and gains g. The index i of the code
vector Ci and the gain g with which the error power
is the smallest are computed and transmitted to the
decoder.
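
The search summarized above may be sketched as follows, for illustration only. The synthesis filter A is modeled here as convolution with a truncated impulse response, and the gain for each code vector is taken as the least-squares value; both choices are assumptions for this sketch, as are the names abs_search, synthesize, and VEC_LEN.

```c
#include <assert.h>

#define VEC_LEN 8   /* short vector length, for illustration only */

/* Synthesis filter A, modeled (as an assumption) as convolution of the
 * code vector c with a truncated impulse response h. */
static void synthesize(const double h[VEC_LEN], const double c[VEC_LEN],
                       double ac[VEC_LEN])
{
    int n, k;

    for (n = 0; n < VEC_LEN; n++) {
        ac[n] = 0.0;
        for (k = 0; k <= n; k++)
            ac[n] += h[k] * c[n - k];
    }
}

/* Exhaustive search: returns the index i that minimizes |X - g*A*Ci|^2
 * and stores the corresponding gain in *best_gain. Returns -1 if every
 * synthesized vector is all zero. */
int abs_search(const double x[VEC_LEN], const double h[VEC_LEN],
               const double codebook[][VEC_LEN], int count,
               double *best_gain)
{
    int i, n, best = -1;
    double best_err = 0.0;

    assert(count > 0 && best_gain != 0);
    for (i = 0; i < count; i++) {
        double ac[VEC_LEN], xx = 0.0, xy = 0.0, yy = 0.0, g, err;

        synthesize(h, codebook[i], ac);
        for (n = 0; n < VEC_LEN; n++) {
            xx += x[n] * x[n];
            xy += x[n] * ac[n];
            yy += ac[n] * ac[n];
        }
        if (yy == 0.0)
            continue;              /* no energy, gain undefined */
        g = xy / yy;               /* least-squares gain (assumed) */
        err = xx - g * xy;         /* error power at that gain */
        if (best < 0 || err < best_err) {
            best = i;
            best_err = err;
            *best_gain = g;
        }
    }
    return best;
}
```

The decoder only needs the returned index and gain to regenerate the same signal, provided it holds the same code book and filter.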

ENTRY OF NON-ZERO SAMPLE POSITION
for (i = 0, ix = 0; i < 8; i++) {
    FIRST SAMPLE[i] = smp_pos[ix];
    SECOND SAMPLE[i] = smp_pos[ix+1];
    THIRD SAMPLE[i] = smp_pos[ix+2];
    FOURTH SAMPLE[i] = smp_pos[ix+3];
    FOURTH SAMPLE[i+8] = smp_pos[ix+4];
    ix += 5;
}

POSITION CONTROL OF NON-ZERO SAMPLE
for (i = 0, ix = 40, iy = 0; i < ((lag-41)/2+1); i++) {
    smp_pos[39-iy] = ix;
    ix += 2;
    iy += 2;
}
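
The entry loop of the figure may be written as a compilable function as follows. The function name enter_positions and the array names first, second, third, and fourth are introduced here for illustration; they stand for the candidate position tables labeled FIRST SAMPLE through FOURTH SAMPLE in the figure.

```c
#include <assert.h>

/* Distribute the 40 entries of smp_pos over the candidate tables of the
 * four non-zero samples: 8 candidates each for the first three samples
 * and 16 for the fourth. */
void enter_positions(const int smp_pos[40], int first[8], int second[8],
                     int third[8], int fourth[16])
{
    int i, ix;

    assert(smp_pos && first && second && third && fourth);
    for (i = 0, ix = 0; i < 8; i++) {
        first[i]      = smp_pos[ix];
        second[i]     = smp_pos[ix + 1];
        third[i]      = smp_pos[ix + 2];
        fourth[i]     = smp_pos[ix + 3];
        fourth[i + 8] = smp_pos[ix + 4];
        ix += 5;
    }
}
```

If smp_pos[35] has been replaced by 44 through the position control, the last candidate of the first sample becomes 44 instead of 35, while the table sizes stay the same.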

the following attorney(s) and/or agent(s) to prosecute this
application and transact all business in the Patent and Trademark
Office connected therewith (list name and registration number):

Aaron B. Karas, Reg. No. 18,923; Samson Helfgott,
Reg. No. 23,072; and Leonard Cooper, Reg. No. 27,625

Send Correspondence to:

HELFGOTT & KARAS, P.C.
Empire State Building, 60th Floor
New York, New York 10118
United States of America

Direct Telephone Calls to: (name and telephone number)

Helfgott & Karas, P.C.
(212) 643-5000










Full name of sole or first inventor: Yasuji OTA
Inventor's signature / Date: August 16, 1999
Residence: Kanagawa, Japan
Citizenship: Japan
Post Office Address: c/o FUJITSU LIMITED, 1-1, Kamikodanaka
4-chome, Nakahara-ku, Kawasaki-shi,
Kanagawa 211-8588, Japan








Full name of second joint inventor, if any: Masanao SUZUKI
Second inventor's signature / Date: August 16, 1999
Residence: Kanagawa, Japan
Citizenship: Japan
Post Office Address: c/o FUJITSU LIMITED, 1-1, Kamikodanaka
4-chome, Nakahara-ku, Kawasaki-shi,
Kanagawa 211-8588, Japan








Full name of third joint inventor, if any: Yoshiteru TSUCHINAGA
Third inventor's signature / Date: August 16, 1999
Residence: Fukuoka, Japan
Citizenship: Japan
Post Office Address: c/o FUJITSU KYUSHU DIGITAL
TECHNOLOGY LIMITED, 22-8, Hakataekimae
3-chome, Hakata-ku, Fukuoka-shi,
Fukuoka 812-0011, Japan







THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re the Application of: Yasuji OTA et al.
Filed: Concurrently herewith
For: VOICE CODING METHOD, VOICE CODING APPARATUS, AND
VOICE DECODING APPARATUS
Serial No.: Concurrently herewith

August 31, 1999

Assistant Commissioner of Patents
Washington, D.C. 20231

SUB-POWER OF ATTORNEY

SIR:



I, Samson Helfgott, Reg. No. 23,072, attorney of record
herein, do hereby grant a sub-power of attorney to Linda S.
Chan, Reg. No. 42,400, and Jacqueline M. Steady, Reg. No. 44,354.




HELFGOTT & KARAS, P.C.
60th FLOOR
EMPIRE STATE BUILDING
NEW YORK, NY 10118
DOCKET NO.: FUJ015.446

Filed Via Express Mail
Rec. No.: EM366877202US
On: August 31, 1999

Any fee due as a result of this paper, not covered
by an enclosed check, may be charged on Deposit
Acct. No. 08-1634.



